Introduction and Methodology


For this section, we are testing and modeling the percent change of the area of glaciers in Glacier National Park with aspect as a factor. The methodology is:

Distribution of Data


The violin/box plot seems to indicate that an Eastern aspect leads to the smallest percent of area lost, while a Northern aspect leads to the next smallest percent of area lost, and a Northeastern aspect leads to the largest area lost, although the northeastern aspect seems to be distorted by a right-skewed (more larger values) distribution. There also may be some bimodality in the Northeastern distribution, which may mean that there is some unknown split in the Northeastern aspect.

Model Output


The model output segments the predicted area change by aspect, with:

The observed values and the 80% prediction interval make sense, with only one of the 10 values being outside the prediction interval and the values otherwise being dispersed within the intervals.

For the hypothesis, the p-value for the aspect belonging in the model was .0013, which means that we reject the null hypothesis that the aspect does not have a linear affect on the percent of area change.

Metrics


The linear model with an aspect as a factor has an RMSE (root mean squared error - a metric where smaller values are better) of 0.42, while a model with just the intercept term has an RMSE of 3.36

The graph of predicted outcome vs. the observed outcome should be roughly linear, which the data appears to be. The slight deviation is likely due to an extremely small test dataset because of the small number of overall observations.

Individual Conclusions (With Querying Glaciers)


The overall aspect of the area with glaciers appears to be NE/N, so the aspect of the glacier being the same as the overall aspect may lead to the larger decrease in area for Northeast, then North, then East. For applications, preventive measures that may be used to keep a glacier intact should be focused on the aspects with the largest expected area lost(NE, lesser extent N)

Overall Conclusions


Look at us, Go Dukes!!!

---
title: "Percent of Glacial Area Lost by Aspect, Glacier National Park"
output: 
  flexdashboard::flex_dashboard:
    source_code: embed
    storyboard: true
---

```{r setup, include=FALSE}
library(flexdashboard)
library(sf)
library(plotly)
library(raster)
library(leaflet)
library(dplyr)
library(tidyr)
library(ggplot2)
library(crosstalk)
```


```{r, include=FALSE}
# Read in Shapefiles
glac1966 <- st_read("C:/Users/Gus/Downloads/glacdates/GNPglaciers_1966.shp") %>%
  mutate(area = Area1966, YEAR = 1966) %>% dplyr::select(-Area1966)
glac1998 <- st_read("C:/Users/Gus/Downloads/glacdates/GNPglaciers_1998.shp") %>%
  mutate(area = Area1998, YEAR = 1998) %>% dplyr::select(-Area1998)
glac2005 <- st_read("C:/Users/Gus/Downloads/glacdates/GNPglaciers_2005.shp") %>%
  mutate(area = Area2005, YEAR = 2005) %>% dplyr::select(-Area2005)
glac2015 <- st_read("C:/Users/Gus/Downloads/glacdates/GNPglaciers_2015.shp") %>%
  mutate(area = Area2015, Year = 2015) %>% dplyr::select(-Area2015)
names(glac1966) <- tolower(names(glac1966))
names(glac1998) <- tolower(names(glac1998))
names(glac2005) <- tolower(names(glac2005))
names(glac2015) <- tolower(names(glac2015))
```

```{r, include=FALSE}
# Combine Shapefiles, Make them Readable
glac2015 <- glac2015[, names(glac2015) %in% c(names(glac1966), "area2015")]
glacover <- rbind(glac1966, glac1998, glac2005) %>% 
  mutate(recno = as.numeric(recno), x_coord = as.numeric(x_coord), 
         y_coord = as.numeric(y_coord), objectid = as.numeric(objectid))
glacover <- bind_rows(glacover, glac2015)
glacover <- glacover %>% st_transform(CRS("+proj=longlat + datum=WGS84"))
glacover <- glacover %>% arrange(recno) %>% filter(!(recno == 1118)) %>%
  mutate(recno = ifelse(recno == 1141, 1137, recno))
```

```{r, include=FALSE}
# Create a joint dataset for glacover
glacjoin <- read.csv("C:/Users/Gus/Documents/glacjoins.csv")
glacasp <- read.csv("./glacdata2.csv")
glacjoin$aspect <- as.character(glacasp$Aspect[c(1:13, 15:39)])
glacjoin <- glacjoin %>% mutate(aspect = ifelse(aspect == "SE", "E", aspect),
                                aspect = ifelse(recno == 1323, "E", aspect),
                                aspect = ifelse(recno == 1201, "NE", aspect),
                                aspect = ifelse(recno == 1555, "NE", aspect),
                                change_per = change_2015 * 100,
                                logchange = log(abs(change_per)))
glacover$aspect <- rep(glacjoin$aspect, each = 4)

glacsd <- crosstalk::SharedData$new(glacover)
glacasp <- glacasp %>%
  filter(!(Recno == 1141))
```

### Introduction and Methodology {data-commentary-width=400}

```{r}
# Creates a Nice Data Frame for the Graph
glacplot <- glacjoin %>% 
  dplyr::select(recno, aspect, tidyr::starts_with("change_"), -change_per) %>% 
  tidyr::pivot_longer(starts_with("change_")) %>%
  mutate(year = readr::parse_number(name), value = value * 100)
glacplot$maxelev66 <- rep(glacasp$Max.Elv.1966, each = 4)

# Creates an Overall Graph
ggplotly(ggplot(glacplot, aes(x = year, y = value, color = aspect)) + 
           geom_point(aes(text = paste("Year: ", year,
                                       "
", "Aspect: ", aspect, "
", "Percent of Area Lost: ", round(value, 2), "%", sep = ""))) + geom_smooth(method = "lm", se = FALSE) + labs(title = "Percent of Glacier Area Lost vs. year, grouped by aspect", y = "Percent of Glacier Area Lost"), tooltip = "text") ``` *** For this section, we are testing and modeling the percent change of the area of glaciers in Glacier National Park with aspect as a factor. The methodology is: - Read in and clean the data + Had different decisions about what aspects are, grouped SE with E - Perform exploratory data analysis and then transform the data accordingly + Perform a logarithmic transformation on the response variable - Create a training/testing dataset - Create a linear model - Obtain the fit and the prediction interval ### Distribution of Data {data-commentary-width=400} ```{r} # creates a violin plot with a boxplot inside of it plot_ly(glacjoin, x = ~as.factor(aspect), y = ~change_per, type = "violin", box = list(visible = TRUE)) %>% layout(xaxis = list(title = "Aspect"), yaxis = list(title = "Percent Change of Glacier Area")) ``` *** The violin/box plot seems to indicate that an Eastern aspect leads to the smallest percent of area lost, while a Northern aspect leads to the next smallest percent of area lost, and a Northeastern aspect leads to the largest area lost, although the northeastern aspect seems to be distorted by a right-skewed (more larger values) distribution. There also may be some bimodality in the Northeastern distribution, which may mean that there is some unknown split in the Northeastern aspect. ### Model Output {data-commentary-width=400} ```{r} # Create Splits set.seed(1234) glac_split <- rsample::initial_split(glacjoin, prop = .7, strata = aspect) glac_train <- rsample::training(glac_split) glac_test <- rsample::testing(glac_split) # Get the trained linear model lmtrain <- lm(logchange ~ as.factor(aspect), data = glac_train) # Predict Values from the linear model pred <- as.data.frame(-exp(predict(lmtrain, glac_test, se.fit = TRUE, interval = "prediction", level = .8)$fit)) pred_test <- cbind(glac_test, pred) # Plot the prediction interval and the test values ggplotly( ggplot(pred_test, aes(x = aspect, y = change_per)) + geom_point(size = 2, aes(text = paste( "Actual Percent of Original Area Lost: ", round(change_per, 2), "%", "
", "Prediction Bounds: (", round(lwr, 2), "%", ", ", round(upr, 2), "%", ")", sep = "")) ) + geom_errorbar(aes(ymin = lwr, ymax = upr), color = "blue", width = .3)+theme_minimal() + labs(title = "Actual Values vs. 80 % Prediction interval", x = "Aspect", y = "Percent of Original Area Lost"), tooltip = "text" ) ``` *** The model output segments the predicted area change by aspect, with: - East + The least variance in its prediction and has the lowest predicted percent loss of original area - North + The median variance in its prediction and has the median predicted percent loss of original area - Northeast + The largest variance in its prediction and has the largest predicted percent loss of original area The observed values and the 80% prediction interval make sense, with only one of the 10 values being outside the prediction interval and the values otherwise being dispersed within the intervals. For the hypothesis, the p-value for the aspect belonging in the model was .0013, which means that we reject the null hypothesis that the aspect does not have a linear affect on the percent of area change. ### Metrics {data-commentary-width=400} ```{r} # Plots the predicted vs. actual values, below the inline chunks ('r -') finds # the RMSE ggplotly(ggplot(pred_test, aes(x = log(-fit), y = log(-change_per))) + geom_point() + geom_smooth(method = "lm") + labs(title = "Predicted vs. Actual Percent of Area Lost", y = "Actual Percent of Area Lost (log scale)", x = "Predicted amount of Area Lost (log scale")) ``` *** The linear model with an aspect as a factor has an RMSE (root mean squared error - a metric where smaller values are better) of **`r round(ModelMetrics::rmse(log(-pred_test$change_per), log(-pred_test$fit)), 2)`**, while a model with just the intercept term has an RMSE of **`r round(ModelMetrics::rmse(log(-pred_test$change_per), mean(log(-pred_test$change_per))), 2)`** The graph of predicted outcome vs. the observed outcome should be roughly linear, which the data appears to be. The slight deviation is likely due to an extremely small test dataset because of the small number of overall observations. ### Individual Conclusions (With Querying Glaciers) {data-commentary-width=600} ```{r, warning=FALSE} # Plots the glaciers, below the 'r crosstalk' creates the controls plot_geo(glacsd) ``` *** `r crosstalk::filter_checkbox("Year", "Select the Year: ", glacsd, ~year)` `r crosstalk::filter_select("RECNO", "Select the recno: ", glacsd, ~recno)` `r crosstalk::filter_select("ASPECT", "Select the aspect: ", glacsd, ~aspect)` The overall aspect of the area with glaciers appears to be NE/N, so the aspect of the glacier being the same as the overall aspect may lead to the larger decrease in area for Northeast, then North, then East. For applications, preventive measures that may be used to keep a glacier intact should be focused on the aspects with the largest expected area lost(NE, lesser extent N) ### Overall Conclusions {data-commentary-wdith=400} ```{r} plot_ly(glacplot, x = ~year, y = ~maxelev66, z = ~value, split = ~aspect, type = "scatter3d", mode = "markers", hoverinfo = "text", text = ~paste( "Year: ", year, "
", "Aspect: ", aspect, "
", "Maximum Elevation in 1966: ", round(maxelev66, 2), "m", "
", "Percent of Area Lost since 1966: ", round(value, 2), "%", sep = "" )) %>% layout(title = "3D Scatter Plot of all of our Variables", scene = list(xaxis = list(title = "Year"), yaxis = list(title = "Maximum Elevation, 1966 (m)"), zaxis = list(title = "Percent of Area Change Since 1966"))) ``` *** Look at us, Go Dukes!!!